Overview

Dataset statistics

Number of variables: 15
Number of observations: 99003
Missing cells: 177
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.4 MiB
Average record size in memory: 173.8 B

Variable types

NUM: 14
CAT: 1

Reproduction

Analysis started: 2020-09-17 08:28:39.656707
Analysis finished: 2020-09-17 08:29:27.915768
Duration: 48.26 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

dob_year is highly correlated with age (High correlation)
age is highly correlated with dob_year (High correlation)
mobile_likes_received is highly correlated with likes_received (High correlation)
likes_received is highly correlated with mobile_likes_received and 1 other field (High correlation)
www_likes_received is highly correlated with likes_received (High correlation)
likes_received is highly skewed (γ1 = 112.0745682) (Skewed)
mobile_likes_received is highly skewed (γ1 = 107.5312999) (Skewed)
www_likes_received is highly skewed (γ1 = 126.257317) (Skewed)
userid has unique values (Unique)
friend_count has 1962 (2.0%) zeros (Zeros)
friendships_initiated has 2997 (3.0%) zeros (Zeros)
likes has 22308 (22.5%) zeros (Zeros)
likes_received has 24428 (24.7%) zeros (Zeros)
mobile_likes has 35056 (35.4%) zeros (Zeros)
mobile_likes_received has 30003 (30.3%) zeros (Zeros)
www_likes has 60999 (61.6%) zeros (Zeros)
www_likes_received has 36864 (37.2%) zeros (Zeros)
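
The γ1 values quoted in the skewness warnings are skewness coefficients. As a rough illustration, the sketch below computes the moment-based skewness from population moments. This is an assumption about the definition: the estimator used by the report may apply a small-sample correction, so its values can differ slightly. The function name and the toy data are hypothetical.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical toy data: mostly zeros plus one large value, loosely
# mimicking the shape of a column such as likes_received.
toy = [0, 0, 0, 10]
print(skewness(toy))        # positive, because of the long right tail
print(skewness([1, 2, 3]))  # symmetric data has zero skewness
```

A single extreme value is enough to produce a large positive coefficient, which is consistent with the very large maxima reported for the skewed columns.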

Variables

userid
Real number (ℝ≥0)

UNIQUE
Distinct count: 99003
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1597045.2079128916
Minimum: 1000008
Maximum: 2193542
Zeros: 0
Zeros (%): 0.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 1000008
5-th percentile: 1060618.3
Q1: 1298805.5
median: 1596148
Q3: 1895744
95-th percentile: 2133357.1
Maximum: 2193542
Range: 1193534
Interquartile range (IQR): 596938.5

Descriptive statistics

Standard deviation: 344059.1775
Coefficient of variation (CV): 0.2154348391
Kurtosis: -1.199556831
Mean: 1597045.208
Median Absolute Deviation (MAD): 298438
Skewness: 0.0001076605667
Sum: 1.581122667e+11
Variance: 1.183767176e+11

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
1159224 1 < 0.1%
1129202 1 < 0.1%
1055510 1 < 0.1%
1855227 1 < 0.1%
2110369 1 < 0.1%
1991449 1 < 0.1%
2128666 1 < 0.1%
1884335 1 < 0.1%
2082123 1 < 0.1%
1026848 1 < 0.1%
Other values (98993) 98993 > 99.9%

Value Count Frequency (%)
1000008 1 < 0.1%
1000013 1 < 0.1%
1000015 1 < 0.1%
1000038 1 < 0.1%
1000059 1 < 0.1%

Value Count Frequency (%)
2193542 1 < 0.1%
2193538 1 < 0.1%
2193522 1 < 0.1%
2193499 1 < 0.1%
2193485 1 < 0.1%

age
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 101
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.28022383160106
Minimum: 13
Maximum: 113
Zeros: 0
Zeros (%): 0.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 13
5-th percentile: 15
Q1: 20
median: 28
Q3: 50
95-th percentile: 90
Maximum: 113
Range: 100
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 22.58974831
Coefficient of variation (CV): 0.6059445462
Kurtosis: 1.561446767
Mean: 37.28022383
Median Absolute Deviation (MAD): 10
Skewness: 1.415260654
Sum: 3690854
Variance: 510.2967289

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
18 5196 5.2%
23 4404 4.4%
19 4391 4.4%
20 3769 3.8%
21 3671 3.7%
25 3641 3.7%
17 3283 3.3%
16 3086 3.1%
22 3032 3.1%
24 2827 2.9%
Other values (91) 61703 62.3%

Value Count Frequency (%)
13 484 0.5%
14 1925 1.9%
15 2618 2.6%
16 3086 3.1%
17 3283 3.3%

Value Count Frequency (%)
113 202 0.2%
112 18 < 0.1%
111 18 < 0.1%
110 15 < 0.1%
109 9 < 0.1%

dob_day
Real number (ℝ≥0)

Distinct count: 31
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.53040816944941
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 7
median: 14
Q3: 22
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 9.015606359
Coefficient of variation (CV): 0.6204647697
Kurtosis: -1.188960111
Mean: 14.53040817
Median Absolute Deviation (MAD): 8
Skewness: 0.1078407568
Sum: 1438554
Variance: 81.28115802

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
1 7900 8.0%
10 4030 4.1%
15 3555 3.6%
5 3545 3.6%
12 3413 3.4%
2 3409 3.4%
3 3291 3.3%
17 3266 3.3%
20 3263 3.3%
14 3219 3.3%
Other values (21) 60112 60.7%

Value Count Frequency (%)
1 7900 8.0%
2 3409 3.4%
3 3291 3.3%
4 3217 3.2%
5 3545 3.6%

Value Count Frequency (%)
31 1507 1.5%
30 2530 2.6%
29 2508 2.5%
28 2955 3.0%
27 2755 2.8%

dob_year
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 101
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1975.719776168399
Minimum: 1900
Maximum: 2000
Zeros: 0
Zeros (%): 0.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 1900
5-th percentile: 1923
Q1: 1963
median: 1985
Q3: 1993
95-th percentile: 1998
Maximum: 2000
Range: 100
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 22.58974831
Coefficient of variation (CV): 0.01143368032
Kurtosis: 1.561446767
Mean: 1975.719776
Median Absolute Deviation (MAD): 10
Skewness: -1.415260654
Sum: 195602185
Variance: 510.2967289

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
1995 5196 5.2%
1990 4404 4.4%
1994 4391 4.4%
1993 3769 3.8%
1992 3671 3.7%
1988 3641 3.7%
1996 3283 3.3%
1997 3086 3.1%
1991 3032 3.1%
1989 2827 2.9%
Other values (91) 61703 62.3%

Value Count Frequency (%)
1900 202 0.2%
1901 18 < 0.1%
1902 18 < 0.1%
1903 15 < 0.1%
1904 9 < 0.1%

Value Count Frequency (%)
2000 484 0.5%
1999 1925 1.9%
1998 2618 2.6%
1997 3086 3.1%
1996 3283 3.3%

dob_month
Real number (ℝ≥0)

Distinct count: 12
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.283365150550994
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.529671569
Coefficient of variation (CV): 0.5617485987
Kurtosis: -1.240397572
Mean: 6.283365151
Median Absolute Deviation (MAD): 3
Skewness: 0.03129550742
Sum: 622072
Variance: 12.45858138

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
1 11772 11.9%
10 8476 8.6%
5 8271 8.4%
8 8266 8.3%
3 8110 8.2%
7 8021 8.1%
9 7939 8.0%
12 7894 8.0%
4 7810 7.9%
2 7632 7.7%
Other values (2) 14812 15.0%

Value Count Frequency (%)
1 11772 11.9%
2 7632 7.7%
3 8110 8.2%
4 7810 7.9%
5 8271 8.4%

Value Count Frequency (%)
12 7894 8.0%
11 7205 7.3%
10 8476 8.6%
9 7939 8.0%
8 8266 8.3%

gender
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 175
Missing (%): 0.2%
Memory size: 773.6 KiB

Value Count Frequency (%)
male 58574 59.2%
female 40254 40.7%
(Missing) 175 0.2%

Length

Max length: 6
Mean length: 4.811419856
Min length: 3

Value Count Frequency (%)
Lowercase_Letter 6 100.0%

Value Count Frequency (%)
Latin 6 100.0%

Value Count Frequency (%)
ASCII 6 100.0%

tenure
Real number (ℝ≥0)

Distinct count: 2426
Unique (%): 2.5%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 537.8873748750012
Minimum: 0.0
Maximum: 3139.0
Zeros: 70
Zeros (%): 0.1%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 47
Q1: 226
median: 412
Q3: 675
95-th percentile: 1575
Maximum: 3139
Range: 3139
Interquartile range (IQR): 449

Descriptive statistics

Standard deviation: 457.6498739
Coefficient of variation (CV): 0.8508284359
Kurtosis: 2.199058275
Mean: 537.8873749
Median Absolute Deviation (MAD): 213
Skewness: 1.535680925
Sum: 53251388
Variance: 209443.4071

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
300 173 0.2%
303 170 0.2%
242 164 0.2%
272 163 0.2%
257 161 0.2%
297 161 0.2%
285 160 0.2%
280 160 0.2%
284 158 0.2%
278 158 0.2%
Other values (2416) 97373 98.4%

Value Count Frequency (%)
0 70 0.1%
1 60 0.1%
2 72 0.1%
3 79 0.1%
4 86 0.1%

Value Count Frequency (%)
3139 3 < 0.1%
3129 1 < 0.1%
3128 1 < 0.1%
3101 1 < 0.1%
3019 1 < 0.1%

friend_count
Real number (ℝ≥0)

ZEROS
Distinct count: 2562
Unique (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 196.3507873498783
Minimum: 0
Maximum: 4923
Zeros: 1962
Zeros (%): 2.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 31
median: 82
Q3: 206
95-th percentile: 720
Maximum: 4923
Range: 4923
Interquartile range (IQR): 175

Descriptive statistics

Standard deviation: 387.304229
Coefficient of variation (CV): 1.972511719
Kurtosis: 50.09427289
Mean: 196.3507873
Median Absolute Deviation (MAD): 64
Skewness: 6.059008484
Sum: 19439317
Variance: 150004.5658

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 1962 2.0%
1 1816 1.8%
2 1117 1.1%
3 860 0.9%
5 789 0.8%
4 749 0.8%
10 737 0.7%
24 732 0.7%
6 720 0.7%
29 719 0.7%
Other values (2552) 88802 89.7%

Value Count Frequency (%)
0 1962 2.0%
1 1816 1.8%
2 1117 1.1%
3 860 0.9%
4 749 0.8%

Value Count Frequency (%)
4923 1 < 0.1%
4917 1 < 0.1%
4863 1 < 0.1%
4845 1 < 0.1%
4844 1 < 0.1%

friendships_initiated
Real number (ℝ≥0)

ZEROS
Distinct count: 1519
Unique (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 107.45247113723826
Minimum: 0
Maximum: 4144
Zeros: 2997
Zeros (%): 3.0%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 17
median: 46
Q3: 117
95-th percentile: 418
Maximum: 4144
Range: 4144
Interquartile range (IQR): 100

Descriptive statistics

Standard deviation: 188.786951
Coefficient of variation (CV): 1.756934475
Kurtosis: 42.53560096
Mean: 107.4524711
Median Absolute Deviation (MAD): 36
Skewness: 5.150757415
Sum: 10638117
Variance: 35640.51287

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 2997 3.0%
1 2212 2.2%
2 1551 1.6%
3 1355 1.4%
4 1352 1.4%
6 1328 1.3%
5 1328 1.3%
11 1319 1.3%
8 1314 1.3%
13 1279 1.3%
Other values (1509) 82968 83.8%

Value Count Frequency (%)
0 2997 3.0%
1 2212 2.2%
2 1551 1.6%
3 1355 1.4%
4 1352 1.4%

Value Count Frequency (%)
4144 1 < 0.1%
3654 1 < 0.1%
3594 1 < 0.1%
3538 1 < 0.1%
3415 1 < 0.1%

likes
Real number (ℝ≥0)

ZEROS
Distinct count: 2924
Unique (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 156.07878549134875
Minimum: 0
Maximum: 25111
Zeros: 22308
Zeros (%): 22.5%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 11
Q3: 81
95-th percentile: 726
Maximum: 25111
Range: 25111
Interquartile range (IQR): 80

Descriptive statistics

Standard deviation: 572.2806808
Coefficient of variation (CV): 3.666614134
Kurtosis: 200.4456878
Mean: 156.0787855
Median Absolute Deviation (MAD): 11
Skewness: 11.02370356
Sum: 15452268
Variance: 327505.1777

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 22308 22.5%
1 6928 7.0%
2 4434 4.5%
3 3240 3.3%
4 2507 2.5%
5 2027 2.0%
6 1806 1.8%
7 1618 1.6%
8 1430 1.4%
9 1381 1.4%
Other values (2914) 51324 51.8%

Value Count Frequency (%)
0 22308 22.5%
1 6928 7.0%
2 4434 4.5%
3 3240 3.3%
4 2507 2.5%

Value Count Frequency (%)
25111 1 < 0.1%
21652 1 < 0.1%
16732 1 < 0.1%
16583 1 < 0.1%
14799 1 < 0.1%

likes_received
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS
Distinct count: 2681
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 142.6893629485975
Minimum: 0
Maximum: 261197
Zeros: 24428
Zeros (%): 24.7%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 8
Q3: 59
95-th percentile: 561
Maximum: 261197
Range: 261197
Interquartile range (IQR): 58

Descriptive statistics

Standard deviation: 1387.919613
Coefficient of variation (CV): 9.726861091
Kurtosis: 17384.94
Mean: 142.6893629
Median Absolute Deviation (MAD): 8
Skewness: 112.0745682
Sum: 14126675
Variance: 1926320.851

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 24428 24.7%
1 7305 7.4%
2 4541 4.6%
3 3347 3.4%
4 2669 2.7%
5 2373 2.4%
6 1873 1.9%
7 1680 1.7%
8 1538 1.6%
9 1351 1.4%
Other values (2671) 47898 48.4%

Value Count Frequency (%)
0 24428 24.7%
1 7305 7.4%
2 4541 4.6%
3 3347 3.4%
4 2669 2.7%

Value Count Frequency (%)
261197 1 < 0.1%
178166 1 < 0.1%
152014 1 < 0.1%
106025 1 < 0.1%
82623 1 < 0.1%

mobile_likes
Real number (ℝ≥0)

ZEROS
Distinct count: 2396
Unique (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 106.11629950607558
Minimum: 0
Maximum: 25111
Zeros: 35056
Zeros (%): 35.4%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 46
95-th percentile: 481.9
Maximum: 25111
Range: 25111
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 445.2529851
Coefficient of variation (CV): 4.195896268
Kurtosis: 360.9885806
Mean: 106.1162995
Median Absolute Deviation (MAD): 4
Skewness: 14.16123656
Sum: 10505832
Variance: 198250.2207

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 35056 35.4%
1 6297 6.4%
2 3941 4.0%
3 2917 2.9%
4 2265 2.3%
5 1794 1.8%
6 1598 1.6%
7 1395 1.4%
8 1212 1.2%
9 1149 1.2%
Other values (2386) 41379 41.8%

Value Count Frequency (%)
0 35056 35.4%
1 6297 6.4%
2 3941 4.0%
3 2917 2.9%
4 2265 2.3%

Value Count Frequency (%)
25111 1 < 0.1%
21652 1 < 0.1%
16732 1 < 0.1%
14039 1 < 0.1%
13529 1 < 0.1%

mobile_likes_received
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS
Distinct count: 2004
Unique (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84.1204912982435
Minimum: 0
Maximum: 138561
Zeros: 30003
Zeros (%): 30.3%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 33
95-th percentile: 317
Maximum: 138561
Range: 138561
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 839.8894437
Coefficient of variation (CV): 9.984362083
Kurtosis: 15522.64932
Mean: 84.1204913
Median Absolute Deviation (MAD): 4
Skewness: 107.5312999
Sum: 8328181
Variance: 705414.2777

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 30003 30.3%
1 8243 8.3%
2 4948 5.0%
3 3608 3.6%
4 2944 3.0%
5 2383 2.4%
6 2022 2.0%
7 1745 1.8%
8 1521 1.5%
9 1437 1.5%
Other values (1994) 40149 40.6%

Value Count Frequency (%)
0 30003 30.3%
1 8243 8.3%
2 4948 5.0%
3 3608 3.6%
4 2944 3.0%

Value Count Frequency (%)
138561 1 < 0.1%
131244 1 < 0.1%
89911 1 < 0.1%
73333 1 < 0.1%
43410 1 < 0.1%

www_likes
Real number (ℝ≥0)

ZEROS
Distinct count: 1726
Unique (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.96242538104906
Minimum: 0
Maximum: 14865
Zeros: 60999
Zeros (%): 61.6%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 7
95-th percentile: 208
Maximum: 14865
Range: 14865
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 285.5601519
Coefficient of variation (CV): 5.715498191
Kurtosis: 449.1484832
Mean: 49.96242538
Median Absolute Deviation (MAD): 0
Skewness: 16.91102529
Sum: 4946430
Variance: 81544.60033

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 60999 61.6%
1 4697 4.7%
2 2760 2.8%
3 1948 2.0%
4 1419 1.4%
5 1202 1.2%
6 1081 1.1%
7 897 0.9%
8 792 0.8%
9 757 0.8%
Other values (1716) 22451 22.7%

Value Count Frequency (%)
0 60999 61.6%
1 4697 4.7%
2 2760 2.8%
3 1948 2.0%
4 1419 1.4%

Value Count Frequency (%)
14865 1 < 0.1%
12903 1 < 0.1%
11077 1 < 0.1%
10763 1 < 0.1%
10627 1 < 0.1%

www_likes_received
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS
Distinct count: 1636
Unique (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.56883124753795
Minimum: 0
Maximum: 129953
Zeros: 36864
Zeros (%): 37.2%
Memory size: 773.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 20
95-th percentile: 227
Maximum: 129953
Range: 129953
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 601.416348
Coefficient of variation (CV): 10.26853934
Kurtosis: 23812.2491
Mean: 58.56883125
Median Absolute Deviation (MAD): 2
Skewness: 126.257317
Sum: 5798490
Variance: 361701.6237

Histogram with fixed size bins (bins=10)

Value Count Frequency (%)
0 36864 37.2%
1 8513 8.6%
2 5111 5.2%
3 3586 3.6%
4 2828 2.9%
5 2317 2.3%
6 1918 1.9%
7 1602 1.6%
8 1445 1.5%
9 1373 1.4%
Other values (1626) 33446 33.8%

Value Count Frequency (%)
0 36864 37.2%
1 8513 8.6%
2 5111 5.2%
3 3586 3.6%
4 2828 2.9%

Value Count Frequency (%)
129953 1 < 0.1%
62103 1 < 0.1%
39605 1 < 0.1%
39213 1 < 0.1%
34039 1 < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the slope of the line affects only the sign of r, not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
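
As a minimal sketch of this calculation in plain Python (the function name `pearson_r` is hypothetical, and population moments are assumed; the report itself relies on its own library routines):

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (sd_x * sd_y)


# age and dob_year values read from a few of the sample rows in this report.
age = [14, 13, 19, 20, 24]
dob_year = [1999, 2000, 1994, 1993, 1989]
print(pearson_r(age, dob_year))  # very close to -1: an exact decreasing linear relation
```

In these rows age and dob_year always sum to the same constant, which is why the two columns are flagged as highly correlated.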

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
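
A minimal sketch of this recipe in plain Python, assuming tied values receive the average of their ranks (a common convention; the helper names are hypothetical):

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in ry) / n)
    return cov / (sx * sy)


# A monotonic but nonlinear relation (y = x**3) still gives rho close to 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ordering of the values matters, any strictly increasing transformation of either variable leaves ρ unchanged.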

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
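
The pair-counting definition above can be sketched as follows. This is the simple variant often called tau-a, in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly report a tie-corrected variant instead, so their results may differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The loop compares every pair of observations, so this sketch is quadratic in the number of rows and is meant for illustration rather than for a dataset of this size.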

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Sample

First rows

    userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_initiated  likes  likes_received  mobile_likes  mobile_likes_received  www_likes  www_likes_received
0  2094382   14       19      1999         11    male   266.0             0                      0      0               0             0                      0          0                   0
1  1192601   14        2      1999         11  female     6.0             0                      0      0               0             0                      0          0                   0
2  2083884   14       16      1999         11    male    13.0             0                      0      0               0             0                      0          0                   0
3  1203168   14       25      1999         12  female    93.0             0                      0      0               0             0                      0          0                   0
4  1733186   14        4      1999         12    male    82.0             0                      0      0               0             0                      0          0                   0
5  1524765   14        1      1999         12    male    15.0             0                      0      0               0             0                      0          0                   0
6  1136133   13       14      2000          1    male    12.0             0                      0      0               0             0                      0          0                   0
7  1680361   13        4      2000          1  female     0.0             0                      0      0               0             0                      0          0                   0
8  1365174   13        1      2000          1    male    81.0             0                      0      0               0             0                      0          0                   0
9  1712567   13        2      2000          2    male   171.0             0                      0      0               0             0                      0          0                   0


Last rows

        userid  age  dob_day  dob_year  dob_month  gender  tenure  friend_count  friendships_initiated  likes  likes_received  mobile_likes  mobile_likes_received  www_likes  www_likes_received
98993  1654565   19       15      1994          8  female   394.0          4538                   4144   4501           15088          4435                   5961         66                9127
98994  2063006   20        4      1993          1  female   402.0          1988                    332   7351          106025          7248                  73333        103               32692
98995  1132164   20        9      1993         10  female   699.0          3611                    973   4507            7768          4414                   6909         93                 859
98996  1668695   24       25      1989          4  female   182.0          2938                   1272   6018           17765          5843                  11708        175                6057
98997  1458985   28       14      1985         12  female   290.0          2218                   1618   4626           10268          4290                   4250        336                6018
98998  1268299   68        4      1945          4  female   541.0          2118                    341   3996           18089          3505                  11887        491                6202
98999  1256153   18       12      1995          3  female    21.0          1968                   1720   4401           13412          4399                  10592          2                2820
99000  1195943   15       10      1998          5  female   111.0          2002                   1524  11959           12554         11959                  11462          0                1092
99001  1468023   23       11      1990          4  female   416.0          2560                    185   4506            6516          4506                   5760          0                 756
99002  1397896   39       15      1974          5  female   397.0          2049                    768   9410           12443          9410                   9530          0                2913